# Multi-task distillation
## DeepSeek R1 Distill Qwen 32B Unsloth Bnb 4bit

Publisher: unsloth · License: Apache-2.0 · Tags: Large Language Model, Transformers, English

DeepSeek-R1 is the first-generation reasoning model released by the DeepSeek team. Trained with large-scale reinforcement learning, without supervised fine-tuning (SFT) as a preliminary step, it demonstrates excellent reasoning capabilities.

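For reference, a minimal loading sketch with Hugging Face transformers, assuming the repo id `unsloth/DeepSeek-R1-Distill-Qwen-32B-unsloth-bnb-4bit` matches the card title and that `bitsandbytes` is installed so the pre-quantized 4-bit weights load directly:

```python
# Minimal sketch: load the pre-quantized 4-bit checkpoint and generate text.
# The repo id and generation settings are assumptions, not part of the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/DeepSeek-R1-Distill-Qwen-32B-unsloth-bnb-4bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # place layers on available GPUs automatically
)

# R1-style chat models expect the chat template for best results.
messages = [{"role": "user", "content": "Explain distillation in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```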
## XtremeDistil L12 H384 Uncased

Publisher: microsoft · License: MIT · Tags: Large Language Model, Transformers, English

XtremeDistilTransformers is a task-agnostic Transformer model distilled via task transfer, yielding a small universal model that can be applied to arbitrary tasks and languages.

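For reference, a minimal feature-extraction sketch, assuming the repo id `microsoft/xtremedistil-l12-h384-uncased` matches the card title; the 384-dimensional hidden states correspond to the H384 in the name:

```python
# Minimal sketch: encode a sentence with the distilled encoder.
# The repo id is an assumption based on the card title above.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "microsoft/xtremedistil-l12-h384-uncased"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

inputs = tokenizer("Distillation keeps models small.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state

print(hidden.shape)  # (1, seq_len, 384): 12 layers, hidden size 384
```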